Feature selection
Resources
- https://en.wikipedia.org/wiki/Feature_selection
- http://machinelearningmastery.com/an-introduction-to-feature-selection/
- http://scikit-learn.org/stable/modules/feature_selection.html
- Removing features with low variance
- Univariate feature selection
- Recursive feature elimination (all three are sketched below)
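A minimal sketch of the three scikit-learn approaches listed above, assuming a synthetic dataset; the threshold, k, and choice of estimator are illustrative, not recommendations:

```python
# Sketch of the three scikit-learn approaches above on a toy dataset.
# Dataset, threshold, k and estimator are illustrative assumptions.
from sklearn.datasets import make_classification
from sklearn.feature_selection import VarianceThreshold, SelectKBest, f_classif, RFE
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=200, n_features=20, n_informative=5, random_state=0)

# 1. Removing features with low variance
X_var = VarianceThreshold(threshold=0.1).fit_transform(X)

# 2. Univariate feature selection (ANOVA F-test, keep the 10 best features)
X_uni = SelectKBest(score_func=f_classif, k=10).fit_transform(X, y)

# 3. Recursive feature elimination with a linear estimator
rfe = RFE(estimator=LogisticRegression(max_iter=1000), n_features_to_select=10).fit(X, y)
X_rfe = rfe.transform(X)

print(X_var.shape, X_uni.shape, X_rfe.shape)
```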
Regularization
See AI/Supervised Learning/Regularized regression
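As one concrete example of regularization used for feature selection, a hedged sketch with an L1-penalized (Lasso) model and scikit-learn's SelectFromModel: the L1 penalty drives some coefficients to exactly zero, and only features with non-zero coefficients are kept. The dataset and alpha value are illustrative assumptions.

```python
# Sketch: L1 regularization (Lasso) as a feature selector via SelectFromModel.
# The alpha value and dataset are illustrative, not tuned.
from sklearn.datasets import make_regression
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import Lasso

X, y = make_regression(n_samples=200, n_features=30, n_informative=5, noise=1.0, random_state=0)

lasso = Lasso(alpha=0.5).fit(X, y)
selector = SelectFromModel(lasso, prefit=True)  # keep features with non-zero coefficients
X_selected = selector.transform(X)
print(X.shape, "->", X_selected.shape)
```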
Tree-based methods
- https://scikit-learn.org/stable/modules/feature_selection.html#tree-based-feature-selection
- Random forests and extra trees: feature importances computed from forests of trees
- XGBoost: feature importance and why it matters:
- http://datawhatnow.com/feature-importance/
- http://machinelearningmastery.com/feature-importance-and-feature-selection-with-xgboost-in-python/
- Importance is calculated for a single decision tree as the amount by which each attribute's split points improve the performance measure, weighted by the number of observations the node is responsible for. The performance measure may be the purity (Gini index) used to select the split points, or another more specific error function. The feature importances are then averaged across all of the decision trees within the model (a sketch follows this list).
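A hedged sketch of selection by tree-based importances, using a random forest and scikit-learn's SelectFromModel; the dataset, number of trees, and threshold are illustrative assumptions. With XGBoost the same pattern applies via the model's feature_importances_ attribute or xgboost.plot_importance.

```python
# Sketch: feature importances from a forest of trees, then selection by threshold.
# Dataset, threshold and model settings are illustrative assumptions.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel

X, y = make_classification(n_samples=300, n_features=25, n_informative=5, random_state=0)

forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# Importance of each feature, averaged over all trees in the forest
ranking = np.argsort(forest.feature_importances_)[::-1]
print("Most important features:", ranking[:5])

# Keep only features whose importance exceeds the mean importance
X_selected = SelectFromModel(forest, threshold="mean", prefit=True).transform(X)
print(X.shape, "->", X_selected.shape)

# With XGBoost the same idea applies (model.feature_importances_ or
# xgboost.plot_importance), assuming the xgboost package is installed.
```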
Books
Code
- #CODE Scikit-feature
- #CODE Feature-selector - Tool for dimensionality reduction of machine learning datasets (usage sketched below)
- Methods: Missing Values, Single Unique Values, Collinear Features, Zero Importance Features, Low Importance Features
- https://github.com/WillKoehrsen/feature-selector/blob/master/Feature Selector Usage.ipynb
- https://towardsdatascience.com/a-feature-selection-tool-for-machine-learning-in-python-b64dd23710f0
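A hedged usage sketch for Feature-selector, reconstructed from the README and blog post above; the dataset and column names are hypothetical, and the method and parameter names should be verified against the linked notebook.

```python
# Sketch of Feature-selector usage (method names follow the repository's README;
# treat them as assumptions and verify against the notebook linked above).
import pandas as pd
from feature_selector import FeatureSelector

train = pd.read_csv("train.csv")   # hypothetical dataset
labels = train.pop("target")       # hypothetical label column

fs = FeatureSelector(data=train, labels=labels)

fs.identify_missing(missing_threshold=0.6)             # Missing Values
fs.identify_single_unique()                            # Single Unique Values
fs.identify_collinear(correlation_threshold=0.98)      # Collinear Features
fs.identify_zero_importance(task="classification",     # Zero Importance Features
                            eval_metric="auc",
                            n_iterations=10,
                            early_stopping=True)
fs.identify_low_importance(cumulative_importance=0.99)  # Low Importance Features

train_removed = fs.remove(methods="all")
```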
- #CODE ITMO_FS
- Feature selection library in Python
- https://itmo-fs.readthedocs.io/en/latest/